EVANS, Thomas C.

TITLE OF INVENTION: Intein Mediated Peptide Ligation 
FILE REFERENCE: NEB-150PUS
CURRENT APPLICATION NUMBER: 09/786,009
<141> CURRENT FILING DATE: 2001-04-17 

PRIOR APPLICATION NUMBER: 60/102,413 
PRIOR FILING DATE: 1999-09-30
PRIOR APPLICATION NUMBER: PCT/US99/22776
PRIOR FILING DATE: 1999-09-30
NUMBER OF SEQ ID NOS: 9
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO: 1 
LENGTH: 43
TYPE: DNA 

ORGANISM: Artificial Sequence 
FEATURE:

OTHER INFORMATION: Description of Artificial Sequence: the modified
C-terminal splice junction of the intein from the
gyrA gene of Mycobacterium xenopi



4 3 



SEQ ID NO: 2 
LENGTH: 39
TYPE: DNA 

ORGANISM: Artificial Sequence 



complementary strand of the C-terminal splice
junction of the modified intein from the gyrA
gene of Mycobacterium xenopi
SEQUENCE: 2 



3 9 



SEQ ID NO: 3
LENGTH: 66
TYPE: DNA

ORGANISM: Artificial Sequence
FEATURE:

OTHER INFORMATION: Description of Artificial Sequence: the polylinker
sequence inserted upstream of the modified intein from the
gyrA gene of Mycobacterium xenopi
SEQUENCE: 3

: : ; .i .i t : l . -h z a ca *~ ~. t ggcc ^taoafaa:a qccqcctcga gggctct.tcc tgcatcacgg c 0 

SEQ ID NO: 4
LENGTH: 69
TYPE: DNA
ORGANISM: Artificial Sequence
FEATURE:
OTHER INFORMATION: At position 41, "H" = A or C or T.
FEATURE:
OTHER INFORMATION: Description of Artificial Sequence: the
complementary strand of the polylinker inserted
upstream of the modified intein from the gyrA
gene of Mycobacterium xenopi
SEQUENCE: 4




■jgt 


gcatc teccgtgatg caggaagagc cctcgaggcg hgc:g:cacc catgg 


:cata 


6 0 


3'.' t 




t c t 


a g a t 




6 9 


SEQ ID NO: 5
LENGTH: 6509
TYPE: DNA
ORGANISM: Artificial Sequence
FEATURE:
OTHER INFORMATION: Description of Artificial Sequence: pTXB1 plasmid
sequence containing the modified intein from the
gyrA gene of Mycobacterium xenopi
SEQUENCE: 5




eg tea ggtggcactt ttcggggaaa tgtgcgcgga acccctattt gtttattttt 


6 0 








tacat t :aaa^argt atccgctcat gagacaataa ccctgataaa tg-tt 


:aata 


12" 






.+-■*■ 


gi.na a jgaagagta tga^tatt ?a acatttccgt gtrgecctta ttccc 


ttttt 


1 8 0 






tgcggcattt tgccttcctg tttttgetea cccagaaajg ^tj 
tgaagatcag ttgggtgcac gagtgggtta catcgaactg gat 
ccttgagagt tttcgccccg aagaacgttc tccaatgatg age 
atgtggcgcg gtattatece gtgttgacgc egggcaagag eaa 



1- • 



xx 

lJU 



ct.at; tctcag 
catgacagta 
c + ;tacttctg 
ggat z atgta 
cqagcgtgae 
cga a c tactt 
tgeaggae ca 
ageeggtgag 
c :gtar egta 
gatcget gag 
a *.a :a:a :tt t 
at tg t a taag 
t ^ ttg t taaa 
t :aaa a g 3 a t 
t. ' aaa ga ac g 
r .i .' ■ x < .i .< 
jggaa :::ta 

aegc t gcgcg 
gat. :\ j 
g t teea etga 
t -tgzrgcgta 



aatgacttgg 
agagaattat 
acaacga t eg 
actcgccttg 
ae -a -gatge 
actctagctt 
cttctgeget 
egt gggt ct e 
gttatctaca 
ataggtgect 
t a ga -1- 1 gatt 
;:aa a t attt a 
tcagctcatt 
a ^ z z z [ X a g a ^ 
t ggac r ccaa 
~ a T a "*a a 
aagggagc — 
.: ;a a i i iq^ 
taa:cac:ac 
.,..gatc— rr 
gcgt cagacc 
at :t.g:tgct 



ttgagtac tc 
gcagtgetgc 
gaggaecgaa 
a tcgttggga 
ct gtage a at. 
cccggcaaca 
e g g c c c 1 1 c c 
geggtatcat 
egaeggggag 
ca ctga z taa 
tac cccggtt 
aattgta aae 
t r. t taaccaa 
agiatt ga gf 
cq teaaaggg 
■■i r c a a g t t. 1. 1 
z :g a t : '..^ga 
'"ra ^ aarjaqcq 

^aataatct 
ccgtagaaaa 
tgcaaacaaa 



aecagtcaca 
cataa ccatg 
ggagctaace 
aceggagctg 
ggeaaeaaeg 
at t. a atagac 
ggctggctgg 
tgcagcaetg 
tcaggcaac t 
gc at. tggtaa 
gat aateaga 
gt r.aata z z i 
tagg rcgaaa 
gt tgttcrag 
eg a a aaa z zq 
" r 99 ?gt :ga 



ggcg - taggg eg rt gg ea ag 




eatg a ecaaa 
gate aaagga 
aaaaceaceg 



atcccttaac 



c t a e • 



:agegg 



t aaaaga 
g egg taa gat 
aagt tetget 
gecgeataca 
ttaeggatgg 
ctgcggceaa 
a eaacatggg 
t a c c a a a c g a 
t. a 1 t a a e t g g 
egga taa agt 
at.aaatc tgg 
gtaagecetc 
gaaa tag a ?a 
aagttta etc 
aaa :aggaag 
rgcgt t a aat 
:c rt *" a t aaa 
gag t zca zz.i 

ageaet a aat. 
ga =3 M-.ieq 
tg t ageggtc 

g t g i g t 1 1 1 
a t e r t +■ t +" t- 1 
tggtt.tgt.tt 



Jr." 

4H0 
04 0 

0 ' j o 

0 6 0 
7 2 (j 

7 * 0 

8 4 0 

9 0 0 
9 0 
l'-2 
l'-8 

12 0 

1 *2 
1 d 



1 r H 

174 



gccggat caa 
a-:caaatact 
aocgcctaca 
g tcgtgtctt 
ctgaacgggg 
a tacctacag 
gtatc cggta 
cgcctggtat 
gtgargctcg 
gt tcctggcc 
tgtggata a c 
cgagcgcagc 
atgccgoata 
gc cccgacac 
cgettacaga 
atcaccgaaa 
acagatgtct 
c tggcttctg 
cctccgtgta 
a tgetcaoga 
a a aca actgg 
"gaa cq :ca 
t tctcgc cga 
a t.tccga ata 
c eg aaaatga 
gt.cataagtg 
a a gg ct ctca 
attgcgttgc 
t.gaatcggc e 
tt t caccagt 
cagoaagcgg 
cggegggata 
accaacgcgc 
gg caa -cage 
ac eggacatg 
gaga tat tta 
taaeagcgcg 
gt :rr :atq 
cgccggaac 
gt taa tgat 
ttcgacg cc 
agr ^ taat. 
g j ca a: c ag 
cagct ::gc 



cgtt a ctggt 
accgegaaag 
a :tcctgca t 
ggaatggtgc 



gagctaccaa 
gtcctt ct ag 
tacctcg etc 
accgggttgg 
ggttcgtgca 
cgtgagctat 
ageggcaggg 
ctttatagtc 
tcaggggggc 
ttttgctggc 
cgtat taccg 
gagtcagtga 
gttaagccag 
ccgccaaca c 
caagctgtga 
cgcgcgaggc 
gcctgttcat 
ataaageggg 
agggggaatt 
tacgggtt.a c 
cggta tgg a t 
gc.aaga eg ta 
a a ;g^ t ggt 
ccgcaagcga 
cecagagcgc 
eggegacgat 
agggcategg 
gctcactgc c 
aacgegeggg 
gagaegggea 
tccacgctgg 
taacat.gagc 
agcccggact 
at cgc agtgg 
gcactccagt 
tgc cage cag 
a t ttgc t. ggt. 
gagaa a a 4 : aa 
ttagtgcagg 
agcccactga 
:t:* ::a 
gccgcgacaa 
j i .'jar' g * *r 
at eg-— get t 

ttcaca tt ea 
gtt: t g eg - 
taggaagcag 
atgccgccct 



ctctttttcc 
tgtagccgta 
tgetaatect 
actcaagacg 
ca cagcccag 
gagaaagege 
teggaa cagg 
ctgtcgggtt 
ggagcctatg 
ettttgetea 
cctttgagtg 
gcgaggaage 
tatacactcc 
e cgctgacgc 
ccgtctccgg 
agetgeggta 
ccgcgtc cag 
ccatgttaag 
tctgttcatg 
tgatgatgaa 
geggeggga - 
g-coagegcg 
gaog gga::a 
caggecgat e 
t.gccggcacc 
agtcatgccc 
tcgagatc c c 
cgctttccag 
gagaggeggt 
acagctgatt 
tttgccccag 
tgtcttcggt 
eggtaatgge 
gaa egatgee 
egcet t e e eg 
eeagacgcag 
gacee a atge 
tactgt tgat 
cag cttccac 
cgcgttgegc 
c : a t c g a e a e 
r tg egaegg 
- c:a^ 
c - act tt t l c 
" ■^ '7 3'- a aaa 
e :a cc :tgaa 
it ^g a^agt 
eecagtagta 
tt egtctt ca 



gaaggtaaet 
gttaggecac 
gttaccagtg 
atagttaccg 
ettggagega 
cacgcttc cc 
agagegcacg 
tcgccacctc 
gaaaaacgee 
catgttcttt 
agctgatacc 
ta tggtgcac 
getategcta 
gecctgaegg 
gagctgeatg 
aagctcatca 
etcgttgagt 
ggcggttttt 
ggggtaatga 
catgcccggt 
ea gaga a a a a 
tcggcegcca 
gt-jacgaaqg 
ategtcgeg j 
tgtc etacga 
cgcgcccaec 
ggtgcctaat 
tegggaaa cc 
ttgcgtattg 
gcccttcacc 
caggegaaaa 
at egtegtat 
gcgcattgcg 
ctcattcag c 
ttcegctat e 
a cgcgccgag 
ga ccagatg e 
gggt.gtct.gg 
age a a tggca 
g a g a a g a 1 1 g 
-accacgctg 
cgcgtgcagg 
tt g-tgtgcc 
• :g - : t- - 
qa cacc gg ca 
tt gact. . : e 
crt cecggat e 
ggttgaggcc 
agaattaatt 



gg ct t cag c a 
ca ettcaaga 
getgetgeea 
gataaggege 
acgacetaca 
gaagggagaa 
agggagctte 
tgacttgage 
ageaacgegg 
ectgegttat 
gctcgccgca 
te tcagtaca 
cgtgact.ggg 
get tgtctge 
tgtcagaggt 
gcgtggtcgt 
ttctecagaa 
tcctgtttgg 
ta ccgatgaa 
tactggaacg 
tcactcaggg 
tgc eggega t 
t gaq-qag 
t :cagcga 5 a 
gttgeatgat 
ggaaggagct 
gagtgagcta 
tgtcgtgcca 
ggcgccaggg 
gc ctggcect 
tcetgtttga 
cc eactaccg 
cccagcgcca 
atttgeatgg 
gget gaat.tt. 
acagaa roa 
tc caegccca 
tea gaga ca t 
l .cc tggtcat 
tgcaocgccg 
geaceeagtt 
g:r,ga:- g- 
a eg rgg^tgg 

tactc tgega 

' " ' ' ! * ' ; 
tegaegctet 
gctgagea c ■ 
c:caatt::a 



gagegcagat 
actc tgtagc 
gtggegat aa 
ageggteggg 
ccgaa ctgag 
aggeggacag 
eagggggaaa 
gtcgattttt 
c ctttttacg 
c ccctgattc 
gccgaacgac 
at.otgct.ctg 
teat ggetgc 
tcccggcatc 
tttcaccgtc 
geagegat tc 
gcgtta atgt 
tcacttgatg 
acgagagagg 
ttgtgagggt 
teaatgecag 
a a tgg :ctg e 
gg -gtgcaag 
a -qg ' : ct c ..; 
aaagaaga ea 
gactgggttg 
acttacatta 
ge tgcattaa 
tggtttttct 
gagagagttg 
tggtggttaa 
a gat a tcegc 
retgategtt 
t ttgt tgaaa 
gattgegagt. 
a^gggecegc 
gtcgcgtacc 
raagaaataa 
c^agegqa ta 
:tra caggc 
gateggegeg 
a - j rg?:a a c 
Taat t aa tt 

17^ t.j ;c ^ 7 

c-i teg ta ta a 

' l * T.-.-l t 

c z ct: atg eg 
g e c g e e g ^ a a 
g g e a t e a a a t 



L 9 J 0 
1 9 y o 
2 04 0 
2 1 ( ) '.) 
2 1 0 

2 2 8 
234 
2 4 <j u 
J 4 t '0 
0 2 1 ' 
2 58 0 
204 
2 70 
2 70 0 
2 82 0 
2 8 8 0 

2 94 0' 

3 0 0 0 
3 0 0 ,_1 
312') 
* :-: '■ 
.^■4 ■ ■ 
3 3 0 ; 1 

6 i ' 
3 4 2C 
348 
3 54 
3 0 0 11 
3 0 r. 

/ 7^ 
3 7 y '.■ 

3 84': 
i y '"i .". 
39kO 

4 0 2 : 
4 0 8 \ 

4 1 4 

4 2^, 
4 2- ■" 
4 . 
4 J t i 
4 ■* 4 
4S ■>■> 
4 . r . ■ 
4 r- ^ -J 
4 -3 M n 






RAW SEQUENCE LISTING 

PATENT APPLICATION: US/09/786,009 



DATE: 09/27/2001
TIME: 12:11:57 



Input Set : A:\Inteinm.app 

Output Set: N:\CRF3\09^72001\l786009.raw 






aaaa^gaaag gctcagtcga aagactgggc ctttcgtLtt 

cgctctcctg agtaggacaa atccgccggg agcggatttg 

cggagggtgg cgggcaggac gcccgccata aactgccagg 

taaaacgaaa ggctcagtcg aaagactggg cctttcgttt 

acgctctcct gagtaggaca aatccgccgg gagcggattt 

ccggagggtg gcgggcagga cgcccgccat aaactgccag 

ataaaacgaa aggctcagtc gaaagactgg gcctttcgtt 

aacgctctcc tgagtaggac aaatccgccg ggagcggatt 

cccggagggt ggcgggcagg acgcccgcca taaactgcca 

aataaaacga aaggctcagt cgaaagactg ggcctttcgt 

gaacgctctc ctgagtagga caaatccgcc gggagcggat 

gcccggaggg tggcgggcag gacgcccgcc ataaactgcc 

aattcccggt ttaaaccggg gatctcgatc ccgcgaaatt 

aattgtgagc ggataacaat tcccctctag aaataatttt 

tatacatatg gctagctcgc gagtcgacgg cggccgcgaa 

catcacggga gatgcactag ttgccctacc cgagggcgag 

oca acagtgacaa cgccatcgac 

eg accggctgtt ccactccggc 

tgc gtgtgacggg caccgcgaac 
ga 



cgtgccgggt gegegge 
tggcaatccc gtgeteg 
gcgtacggtc gaaggtc 
cgacatcgcc ggggtgc 
ttacgcggtg 
, : :ca:ia + t l 



ar.tcaac 
acq ccca 
a^jdci : :.] :* cgaga ^e 
ctact.acgcg aaagtcg 
r.gtcgacacg gcagacc 



gca 
e a a 

:jq 



ccctgctgtg gaagctgatc 
gcgcattcag cgtcgactgt 
cctacacagt cggcgtccct 



■5cg:::aag^ 
gt gtca c eg a 



.at :g::gd^: 
:gc jgg t 



at rtgttgtt 
aacgtt gega 
aattaattcc 
tatctgttgt 
gaacgttgcg 
gaattaatt c 
ttatctgttg 
tgaacgttgc 
g gaattaatt 
tttatctgtt 
ttgaacgttg 
aggaattggg 
aatacgactc 
gtttaacttt 
ttcotcgagg 
teggtacgea 
ctgaaagtcc 
gagcatcegg 
cac c cgttgt 
gacgaaat ca 
gcaggttttg 
ggaotggtgo 
gag rt.ga ccg 
cag ?cggt g t 
gt. cagccacg 
tccgcttggc 
aegtataaat 
gec ttgtgge 
gaagctgagt 
aaaogggt ct. 



tgt cggtgaa 
agcaacggcc 
aggcatcaaa 
ttgtcggtga 
aagcaaegge 
caggcatcaa 
tttgtcggtg 
gaagcaaegg 
ccaggcatca 
gtttgtcggt. 
cgaagcaacg 
gat eggaat t 
actatagggg 
aagaaggaga 
gctcttcctg 
tegcega cat 
ttgaceggea 
tgtaeacggt 
tgtgtttggt 
ageegggega 
cccgcgggaa 
gtttcttgga 
aegggeggtt. 
atagect ■ eg 
ctactggcct. 
aggtcaacac 
gt ttgeagee 
agctt caatg 
tggctgctgc 
tgaggggttt 



4 74 0 
4 8 00 
4 8 k- 0 
4 92 0 

4 9 8 0 

5 0 4 0 
0100 
516 
52 2 
52 8 

5 4 0 
' 1 0 

: >8 0 

56 4 0 

5 7 u 0 

57 6 0 
5820 
5880 
594 0 

6 000 
6 < ' 6 0 



6 2 4 0 
6 30 0 
6 3 6 0 
6 4 2 0 
6 4 8 0 
6 50 9 



acg cgtttatcac gaacgggt.tc 
cacccrgtctg aactcaggcc tcacgacaaa tcctggtgta 
agcttataot gcgggacaat tggtcacata taacggcaag 
ccacacctcc ttggcaggat gggaaccatc caacgttcct 
aot.goaggaa ggggatcegg ctgctaacaa ageccgaaag 
Cdccgctgag caataactag cataacccct tggggectet 
tttgetgaaa ggaggaacta tatceggat 
<210> SEQ ID NO: 6
<211> LENGTH: 30
<212> TYPE: PRT

<213> ORGANISM: Artificial Sequence
<220> FEATURE:

<223> OTHER INFORMATION: Description of Artificial Sequence: synthetic
      peptide
<400> SEQUENCE: 6
Ala Tyr Lys Thr Thr Gln Ala Asn Lys His

Asn Pr

Val Pro Val His

Ile Ile Val Ala
10 15

30

<210> SEQ ID NO: 7
<212> TYPE: PRT
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: the amino acid
      sequence deduced from the polylinker region of
      pTXB1
<400> SEQUENCE: 7

Met Ala Met Gly Gly Gly Arg Leu Glu Gly Ser Ser Cys 
1               5                   10

<210> SEQ ID NO: 8
<211> LENGTH: 42
<212> TYPE: DNA

<213> ORGANISM: Artificial Sequence
<220> FEATURE:

<223> OTHER INFORMATION: Description of Artificial Sequence: polylinker
      region upstream of the modified intein from the
      gyrA gene of Mycobacterium xenopi in pTXB1
<400> SEQUENCE: 8



catatggcca tgggtggcgg ccgcctcgag ggctcttcct gc 
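
As a rough cross-check of how SEQ ID NO: 7 relates to SEQ ID NO: 8, the Python sketch below translates the 42-nucleotide polylinker. It assumes the reading frame begins at the first atg in the sequence; the listing itself does not state the frame, so this is an assumption made for illustration. The codon table is deliberately restricted to the codons that occur in SEQ ID NO: 8, and the function and variable names are hypothetical.

```python
# Subset of the standard genetic code: only the codons present in SEQ ID NO: 8.
CODON_TABLE = {
    "atg": "Met", "gcc": "Ala", "ggt": "Gly", "ggc": "Gly",
    "cgc": "Arg", "ctc": "Leu", "gag": "Glu", "tct": "Ser",
    "tcc": "Ser", "tgc": "Cys",
}

# SEQ ID NO: 8 as printed in the listing, with the grouping spaces removed.
SEQ_ID_NO_8 = "catatggccatgggtggcggccgcctcgagggctcttcctgc"


def translate_from_first_atg(dna):
    """Translate dna in the frame set by its first 'atg' (assumed start codon).

    Returns a list of three-letter residue names. Trailing nucleotides that
    do not fill a complete codon are ignored.
    """
    start = dna.find("atg")
    if start < 0:
        return []
    codons = [dna[i:i + 3] for i in range(start, len(dna) - 2, 3)]
    return [CODON_TABLE[codon] for codon in codons]


if __name__ == "__main__":
    print(" ".join(translate_from_first_atg(SEQ_ID_NO_8)))
```

Under that frame assumption, the translation yields the thirteen residues Met Ala Met Gly Gly Gly Arg Leu Glu Gly Ser Ser Cys, which agrees with SEQ ID NO: 7.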
<210> SEQ ID NO: 9
<211> LENGTH: 7
<212> TYPE: PRT

<213> ORGANISM: Artificial Sequence
<220> FEATURE:

<223> OTHER INFORMATION: Description of Artificial Sequence: synthetic
      peptide
<400> SEQUENCE: 9
ys Asp Pro Glu Lys Asp Ser
1              5